Topic Detection through Statistical Methods
نویسندگان
چکیده
A system is developed to group news stories together according to topic. Several clustering algorithms can be used to group related stories into clusters. The clustering algorithms used require two types of metrics: metrics that, given a story and a set of clusters, can find the most topical cluster for that story; or metrics that can help decide whether or not a given story is on the same topic as a cluster. These metrics are derived by combining simple similarity metrics that compare stories and groups of stories. Finally, methods are proposed for evaluating the story groupings, and experimental results are reported based on these methods. Thesis Supervisor: Victor Zue Title: Associate Director of LCS and Senior Research Scientist
منابع مشابه
Traffic Scene Analysis using Hierarchical Sparse Topical Coding
Analyzing motion patterns in traffic videos can be exploited directly to generate high-level descriptions of the video contents. Such descriptions may further be employed in different traffic applications such as traffic phase detection and abnormal event detection. One of the most recent and successful unsupervised methods for complex traffic scene analysis is based on topic models. In this pa...
متن کاملSemantic Frame-based Statistical Approach for Topic Detection
We propose a statistical frame-based approach (FBA) for natural language processing, and demonstrate its advantage over traditional machine learning methods by using topic detection as a case study. FBA perceives and identifies semantic knowledge in a more general manner by collecting important linguistic patterns within documents through a unique flexible matching scheme that allows word inser...
متن کاملRobust Statistical Methods for Automated Outlier Detection
The computational challenge of automating outlier, or blunder point, detection in radio metric data requires the use of nonstandard statistical methods because the outliers have a deleterious effect upon standard least squares methods. The particular nonstandard methods most applicable to the task are the robust statistical techniques that have undergone intense development since the 1960s. The...
متن کاملMonitoring of Social Network and Change Detection by Applying Statistical Process: ERGM
The statistical modeling of social network data needs much effort because of the complex dependence structure of the tie variables. In order to formulate such dependences, the statistical exponential families of distributions can provide a flexible structure. In this regard, the statistical characteristics of the network is provided to be encapsulated within an Exponential Random Graph Model (...
متن کاملCommunity Detection Based on Topic Distance in Social Tagging Networks
Research on the community detection in social tagging networks has attracted much attention in the last decade. Extracting the hidden topic information from tags provides a new way of thinking for community detection in social tagging networks. In this paper, a topic tagging network by extracting several topics from the tags through using the Latent Dirichlet Allocation (LDA) model is built fir...
متن کاملذخیره در منابع من
با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید
عنوان ژورنال:
دوره شماره
صفحات -
تاریخ انتشار 2013